We thank the reviewers for their careful reading of our work and for their helpful comments. We will also clarify the text in Sections 2.1 and 2.2. In terms of experimental predictions, our work predicts the synaptic weights in the SFA circuit. One mechanism for implementing a quadratic expansion is the use of so-called "Sigma-Pi units" (Rumelhart, Hinton, and McClelland, 1986; Mel and Koch, 1990). In this case, the derivation proceeds exactly as laid out in the paper. Thank you for pointing out the typos.
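To make the expansion concrete, here is a minimal sketch of a quadratic expansion (hypothetical NumPy code; the function name and layout are our own, not taken from the paper). A Sigma-Pi unit then simply computes a weighted sum over these product terms.

    import numpy as np

    def quadratic_expansion(x):
        # All monomials of degree <= 2: x_i and x_i * x_j for i <= j.
        # Linear SFA on these features corresponds to quadratic SFA on x.
        x = np.asarray(x, dtype=float)
        n = len(x)
        terms = list(x)                      # degree-1 terms
        for i in range(n):
            for j in range(i, n):
                terms.append(x[i] * x[j])    # degree-2 product terms
        return np.array(terms)

    # A Sigma-Pi unit computes y = w @ quadratic_expansion(x),
    # i.e. a weighted sum of products of its inputs.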
The authors sincerely thank all the reviewers for their very constructive and helpful comments. Prune: 40% of the channels are pruned. Empirically, this decay policy works better than a constant one, reaching 78.16%. KD together with Dirac is instead provided in Table 3 and Table 4 (the KD(MSE)+Dirac column). The accuracy of ResNet-18 and of plain CNN18 (naive training) is 77.92% and 77.44%, respectively.
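For concreteness, a minimal sketch of pruning 40% of a convolution's output channels (hypothetical PyTorch code; the L1-magnitude criterion is our assumption for illustration, not necessarily the exact criterion used in the paper):

    import torch

    def select_kept_channels(weight, prune_ratio=0.4):
        # weight: (out_channels, in_channels, kH, kW).
        # Score each output channel by the L1 norm of its filter
        # and keep the top 60% (assumed criterion).
        n_keep = int(weight.shape[0] * (1.0 - prune_ratio))
        scores = weight.abs().sum(dim=(1, 2, 3))
        kept = torch.topk(scores, n_keep).indices
        return torch.sort(kept).values

    w = torch.randn(64, 32, 3, 3)
    kept = select_kept_channels(w)   # indices of the 38 surviving channels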
Response to Reviewer 2: Thanks for your helpful comments
Response to Reviewer 1: Thank you for your supportive comments! We will fix the typos in the final version. Q1: "The proposed algorithm is a relatively standard extension of SG-HMC and SGLD." A1: From the perspective of algorithm design, we admit that our algorithm is an extension of SG-HMC. More importantly, the corresponding theoretical guarantees of our algorithm outperform the state-of-the-art. Q2: "The following articles might also be related..." A2: Thank you for pointing out these related articles. We will definitely cite and discuss them in the final version. Q3: "Why not show figures that compare these samples against some ground truth, for example, those obtained by HMC (which is feasible to obtain for GMM and ICA)?"
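For reference, a minimal sketch of the plain SGLD update that serves as the starting point (standard textbook form, not our full algorithm; SG-HMC additionally carries a momentum variable with a friction term):

    import numpy as np

    def sgld_step(theta, grad_log_post, step_size, rng):
        # One SGLD update: half-step along a stochastic gradient of the
        # log posterior, plus Gaussian noise with variance = step_size.
        noise = np.sqrt(step_size) * rng.normal(size=theta.shape)
        return theta + 0.5 * step_size * grad_log_post + noise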
We sincerely thank all the reviewers for their helpful comments. A: The actual accuracy is 75.8%. In Fig. 3, the curves are not simple linear regressions, and their exact form is unknown to us. A: We set the magnitude of RandAugment to 9 with a standard deviation of 0.5 in all networks. We found that "resolution and depth are more important than width" for tiny networks. The performance of TinyNets is about 0.3-3.8% higher than that of the baselines.
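Concretely, the setting above amounts to drawing the RandAugment magnitude from a Gaussian with mean 9 and standard deviation 0.5, as in this minimal sketch (hypothetical code illustrating the configuration, not our training script; the valid magnitude range depends on the RandAugment implementation):

    import numpy as np

    def sample_magnitude(mean=9.0, std=0.5, rng=None):
        # Per-application RandAugment magnitude ~ N(9, 0.5),
        # clipped to be non-negative.
        rng = rng or np.random.default_rng()
        return float(max(rng.normal(mean, std), 0.0))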
We respond to the concerns point-by-point below. Why does distilling prioritized paths improve architecture rating? The more sufficient (fuller) training of subnets leads to a more accurate architecture rating [6] (Sec. 4.3). Which set is used to train the matching network? We will revise the manuscript to make this point clearer.
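As a sketch of the distillation step (hypothetical PyTorch code; the names teacher_logits for the prioritized path and student_logits for the sampled subnet are our illustrative choices, not the paper's exact formulation):

    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, temperature=2.0):
        # Standard KD loss: KL between softened teacher and student
        # distributions, scaled by T^2 to keep gradient magnitudes stable.
        t = temperature
        log_p_student = F.log_softmax(student_logits / t, dim=-1)
        p_teacher = F.softmax(teacher_logits / t, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)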
Our responses to the comments are given below. Response to R1:
We sincerely thank all reviewers for their valuable efforts and insightful comments. We thank R1 for the helpful comment. Following R1's insightful suggestion, we compared GEGL with an additional ablation. We thank R1 for the opportunity to make the following clarifications. We thank R2 and R3 for mentioning an important point. R2's comment: the current literature fails to search for a molecule that is high-scoring and realistic simultaneously.